Abstract

We study algorithmic problems in multi-stage open shop processing systems that 
are centered around reachability and deadlock detection questions. 

We characterize safe and unsafe system states. We show that it is easy to recognize 
system states that can be reached from the initial state (where the system is empty), 
but that in general it is hard to decide whether one given system state is reachable 
from another given system state. We show that the problem of identifying reachable 
deadlock states is hard in general open shop systems, but is easy in the special case 
where no job needs processing on more than two machines (by linear programming 
and matching theory), and in the special case where all machines have capacity one 
(by graph-theoretic arguments). 
1 Introduction



We consider a multi-stage open shop processing system with $n$ jobs $J_1, \dots, J_n$ and $m$
machines $M_1, \dots, M_m$. Every job $J_j$ ($j = 1, \dots, n$) requests processing on a certain
subset $\mathcal{M}(J_j)$ of the machines; the order in which job $J_j$ passes through the machines
in $\mathcal{M}(J_j)$ is irrelevant and can be chosen arbitrarily by the scheduler. Every machine $M_i$
($i = 1, \dots, m$) has a corresponding capacity $\mathrm{cap}(M_i)$, which means that at any moment
in time it can simultaneously hold and process up to $\mathrm{cap}(M_i)$ jobs. For more information
on multi-stage scheduling systems, the reader is referred to the survey [6].

In this article, we are mainly interested in the performance of real-time multi-stage
systems, where the processing time $p_{j,i}$ of job $J_j$ on machine $M_i$ is a priori unknown and
hard to predict. The Central Control (the scheduling policy) of the system learns
the processing time $p_{j,i}$ only when the processing of job $J_j$ on machine $M_i$ is completed.
The various jobs move through the system in an unsynchronized fashion. Here is the
standard behavior of a job in such a system:

1. In the beginning the job is asleep and is waiting outside the system. For technical
reasons, we assume that the job occupies an artificial machine $M_0$ of unbounded
capacity.

2. After a finite amount of time the job wakes up, and starts looking for an available 
machine M on which it still needs processing. If the job detects such a machine 
M, it requests permission from the Central Control to move to machine M. If 
no such machine is available or if the Central Control denies permission, the 
job falls asleep again (and returns to the beginning of Step 2). 

3. If the job receives permission to move, it releases its current machine and starts 
processing on the new machine M. While the job is being processed and while the 
job is asleep, it continuously occupies machine M (and blocks one of the cap(M) 
available places on M). When the processing of the job on machine M is completed 
and in case the job still needs processing on another machine, it returns to Step 2. 

4. As soon as the processing of the job on all relevant machines is completed, the job
informs the Central Control that it is leaving the system. We assume that
the job then moves to an artificial final machine $M_{m+1}$ (with unbounded capacity),
and disappears.

The described system behavior typically occurs in robotic cells and flexible manu- 
facturing systems. The high level goal of the Central Control is to arrive at the 
situation where all the jobs have been completed and left the system. Other goals are 
of course to reach a high system throughput, and to avoid unnecessary waiting times 
of the jobs. However special care has to be taken to prevent the system from reaching 
situations of the following type: 
Example 1.1 Consider an open shop system with three machines $M_1$, $M_2$, $M_3$ of capacity 1.
There are three jobs that each require processing on all three machines. Suppose
that the Central Control behaves as follows:

The first job requests permission to move to machine $M_1$. Permission granted.

The second job requests permission to move to machine $M_2$. Permission granted.

The third job requests permission to move to machine $M_3$. Permission granted.

Once the three jobs have completed their processing on these machines, they keep blocking
their machines and simultaneously keep waiting for the other machines to become idle.
The processing never terminates.

Example 1.1 illustrates a so-called deadlock, that is, a situation in which the system
gets stuck and comes to a halt since no further processing is possible: Every job in the 
system is waiting for resources that are blocked by other jobs that are also waiting in 
the system. Resolving a deadlock is usually expensive (with respect to time, energy, and 
resources), and harmfully diminishes the system performance. In robotic cells resolving 
a deadlock typically requires human interaction. The scientific literature on deadlocks is 
vast, and touches many different areas like flexible manufacturing, automated production, 
operating systems, Petri nets, network routing, etc. 

The literature distinguishes two basic types of system states (see for instance Coffman, 
Elphick & Shoshani [2], Gold [5], or Banaszak & Krogh [1]). A state is called safe, if 
there is at least one possible way of completing all jobs. A state is called unsafe, if every 
possible continuation eventually will get stuck in a deadlock. An example for a safe state 
is the initial situation where all jobs are outside the system (note that the jobs could 
move sequentially through the system and complete). Another example for a safe state 
is the final situation where all jobs have been completed. Examples of unsafe states
are the deadlock states.

Summary of considered problems and derived results 

In this article we study the behavior of safe and unsafe states in open shop scheduling sys- 
tems. In particular, we investigate the computational complexity of the four algorithmic 
questions described in the following paragraphs. First, if one wants to have a smoothly 
running system, then it is essential to distinguish the safe from the unsafe system states: 

Problem: Safe State Recognition 

Instance: An open shop scheduling system. A system state s. 
Question: Is state s safe? 

Section 3 provides a simple characterization of unsafe states, which leads to a straightforward
polynomial time algorithm for telling safe states from unsafe states. Similar
characterizations have already been given a decade ago in the work of Sulistyono &
Lawley [9] and Xing, Lin & Hu [10]. Our new argument is extremely short and simple.

One of the most basic problems in analyzing a system consists in characterizing those 
system states that can be reached while the shop is running. 
Problem: Reachable State Recognition

Instance: An open shop scheduling system. A system state s. 

Question: Can the system reach state s when starting from the initial sit- 
uation where all machines are still empty? 

In Section 4 we derive a polynomial time algorithm for recognizing reachable system
states. The main idea is to reverse the time axis, and to make the system run backward.
Then reachable states in the original system translate into safe states in the reversed
system, and the results from Section 3 can be applied.

Hence recognizing states that are reachable from the initial situation is easy. What 
about recognizing states that are reachable from some other given state? 

Problem: State-to-State Reachability 

Instance: An open shop scheduling system. Two system states s and t. 
Question: Can the system reach state t when starting from state s? 

Surprisingly, there is a strong and sudden jump in the computational complexity of the
reachability problem: Section 5 provides an NP-hardness proof for problem State-to-State
Reachability.

Another fundamental question is whether an open shop system can ever fall into a 
deadlock. In case it cannot, then there are no reachable unsafe states and the Central 
Control may permit all moves right away and without analyzing them; in other words 
the system is fool-proof and will run smoothly without supervision. 

Problem: Reachable Deadlock 
Instance: An open shop scheduling system. 

Question: Can the system ever reach a deadlock state when starting from 
the initial situation? 

Section 6 proves problem Reachable Deadlock to be NP-hard, even for the highly
restricted special case where the capacity of each machine is at most three and where
each job requires processing on at most four machines. In Sections 7 and 8 we exhibit two
special cases for which this problem is solvable in polynomial time: The special case where
every job needs processing on at most two machines is settled by a linear programming
formulation and techniques from matching theory. The special case where every machine
has capacity one is solved by analyzing cycles in certain edge-colored graphs.

2 Basic definitions 

A state of an open shop scheduling system is a snapshot describing a situation that might
potentially occur while the system is running. A state $s$ specifies for every job $J_j$

• the machine $M^s(J_j)$ on which this job is currently waiting or currently being processed,

• and the set $\mathcal{M}^s(J_j) \subseteq \mathcal{M}(J_j) - \{M^s(J_j)\}$ of machines on which the job still needs
future processing.

The machines $M^s(J_j)$ implicitly determine

• the set $\mathcal{J}^s(M_i) \subseteq \{J_1, \dots, J_n\}$ of jobs currently handled by machine $M_i$.

The initial state $0$ is the state where all jobs are still waiting for their first processing; in
other words in the initial state all jobs $J_j$ satisfy $M^0(J_j) = M_0$ and $\mathcal{M}^0(J_j) = \mathcal{M}(J_j)$.
The final state $f$ is the state where all jobs have been completed; in other words in the
final state all jobs $J_j$ satisfy $M^f(J_j) = M_{m+1}$ and $\mathcal{M}^f(J_j) = \emptyset$.

A state $t$ is called a successor of a state $s$, if it results from $s$ by moving a single job
$J_j$ from its current machine $M^s(J_j)$ to some new machine in set $\mathcal{M}^s(J_j)$, or by moving a
job $J_j$ with $\mathcal{M}^s(J_j) = \emptyset$ from its current machine to $M_{m+1}$. In this case we will also say
that the system moves from $s$ to $t$. This successor relation is denoted $s \to t$. A state $t$
is said to be reachable from state $s$, if there exists a finite sequence $s = s_0, s_1, \dots, s_k = t$
of states (with $k \ge 0$) such that $s_{i-1} \to s_i$ holds for $i = 1, \dots, k$. A state $s$ is called
reachable, if it is reachable from the initial state $0$.

Proposition 2.1 Any reachable state $s$ can be reached from the initial state through a
sequence of at most $n + \sum_{j=1}^{n} |\mathcal{M}(J_j)|$ moves.

A state is called safe, if the final state $f$ is reachable from it; otherwise the state is
called unsafe. A state is a deadlock, if it has no successor states and if it is not the final
state $f$.
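The definitions above can be turned into a small state-space explorer. The following Python sketch is an illustration only and is not part of the original analysis: the names `successors`, `is_final`, and `reachable_deadlocks`, the labels `M0` and `Mend` for the two artificial machines, and the encoding of a state as a tuple of (current machine, remaining machines) pairs are all assumptions made here. It enumerates reachable states by breadth-first search, which takes exponential time in general and is only meant for tiny instances such as Example 1.1.

```python
from collections import deque

START, END = "M0", "Mend"  # hypothetical labels for the artificial machines


def successors(state, cap):
    """Yield every successor of `state`.

    state: tuple with one (current_machine, frozenset_of_remaining) pair per job.
    cap:   dict mapping every real machine to its capacity.
    """
    load = {m: 0 for m in cap}
    for cur, _ in state:
        if cur in load:  # artificial machines have unbounded capacity
            load[cur] += 1
    for j, (cur, rem) in enumerate(state):
        if cur == END:
            continue  # the job has already left the system
        # A job with work left may move to any non-full machine it still needs;
        # a job with no work left moves to the artificial final machine.
        targets = [m for m in rem if load[m] < cap[m]] if rem else [END]
        for m in targets:
            nxt = list(state)
            nxt[j] = (m, rem - {m})
            yield tuple(nxt)


def is_final(state):
    return all(cur == END for cur, _ in state)


def reachable_deadlocks(jobs, cap):
    """Brute-force search for deadlocks reachable from the initial state.

    jobs: list of machine sets, one per job (the sets M(J_j)).
    """
    initial = tuple((START, frozenset(ms)) for ms in jobs)
    seen, queue, found = {initial}, deque([initial]), []
    while queue:
        s = queue.popleft()
        succ = list(successors(s, cap))
        if not succ and not is_final(s):
            found.append(s)  # no successor and not final: a deadlock
        for t in succ:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return found


# Example 1.1: three machines of capacity 1, three jobs needing all of them.
cap = {"M1": 1, "M2": 1, "M3": 1}
jobs = [{"M1", "M2", "M3"}] * 3
deadlocks = reachable_deadlocks(jobs, cap)
```

On Example 1.1 the list `deadlocks` contains, among others, the state in which the three jobs sit on three different machines and each still needs the other two.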

3 Analysis of unsafe states 

Unsafe states in open shop systems are fairly well-understood, and the literature contains
several characterizations for them; see for instance Sulistyono & Lawley [9], Xing, Lin &
Hu [10], and Lawley [7]. In this section we provide yet another analysis of unsafe states,
which is shorter and (as we think) simpler than the previously published arguments.

A machine $M$ is called full in state $s$, if it is handling exactly $\mathrm{cap}(M)$ jobs. A
non-empty subset $B$ of the machines is called blocking for state $s$,

• if every machine in $B$ is full, and

• if every job $J_j$ that occupies some machine in $B$ satisfies $\emptyset \neq \mathcal{M}^s(J_j) \subseteq B$.

Here is a simple procedure that determines whether a given machine $M_i$ is part of a
blocking set in state $s$: Let $B_0 = \{M_i\}$. For $k \ge 1$ let $\mathcal{J}_k$ be the union of all job sets
$\mathcal{J}^s(M)$ with $M \in B_{k-1}$, and let $B_k$ be the union of all machine sets $\mathcal{M}^s(J)$ with $J \in \mathcal{J}_k$.
Clearly $B_0 \subseteq B_1 \subseteq \dots \subseteq B_{m-1} = B_m$. Furthermore machine $M_i$ belongs to a blocking
set, if and only if $B_m$ is a blocking set, if and only if all machines in $B_m$ are full. In case
$B_m$ is a blocking set, we denote it by $B^s_{\min}(M_i)$ and call it the canonical blocking set for
machine $M_i$ in state $s$. The canonical blocking set is the smallest blocking set containing
$M_i$:
Lemma 3.1 If machine $M_i$ belongs to a blocking set $B$ in state $s$, then $B^s_{\min}(M_i) \subseteq B$.

The machines in a blocking set B all operate at full capacity on jobs that in the future 
only want to move to other machines in B. Since these jobs are permanently blocked 
from moving, the state s must eventually lead to a deadlock and hence is unsafe. The 
following theorem shows that actually every deadlock is caused by such blocking sets. 

Theorem 3.2 A state s is unsafe if and only if it has a blocking set of machines. 

Proof: The if-statement is obvious. For the only-if-statement, we classify the unsafe
states with respect to their distances to deadlock states. The set $U_0$ contains the deadlock
states. For $d \ge 1$, set $U_d$ contains all states whose successor states are all contained in
$U_{d-1}$. Note that $U_{d-1} \subseteq U_d$, and note that every unsafe state occurs in some $U_d$. We
prove by induction on $d$ that every state in $U_d$ has a blocking set of machines. For $d = 0$
this is trivial.

In the inductive step, assume for the sake of contradiction that some state $s \in U_d$ is
unsafe but does not contain any blocking set. Since every move from $s$ leads to a state
in $U_{d-1}$, all successor states of $s$ must contain blocking sets. Whenever in state $s$ some
job $J$ moves to some (non-full) machine $M$, this machine $M$ must become full and must
then be part of any blocking set. Among all possible moves, consider a move that yields
a state $t$ with a newly full machine $M$ for which the canonical blocking set $B^t_{\min}(M)$ is
of the smallest possible cardinality.

Note that in state $t$ there exist a machine $M' \in B^t_{\min}(M)$ and a job $J' \in \mathcal{J}^t(M')$
with $M \in \mathcal{M}^t(J')$; otherwise $B^t_{\min}(M) - \{M\}$ would be a blocking set for state $s$. Now
consider the successor state $u$ of $s$ that results by moving job $J'$ from machine $M'$ to $M$.
Since $\mathcal{M}^u(J') \subseteq B^t_{\min}(M)$, a simple inductive argument shows that $B^u_{\min}(M) \subseteq B^t_{\min}(M)$.
Since job $J'$ has just jumped away from $M'$, this machine cannot be full in state $u$, and
hence $M' \in B^t_{\min}(M) - B^u_{\min}(M)$. Consequently the canonical blocking set $B^u_{\min}(M)$ has
smaller cardinality than $B^t_{\min}(M)$. This contradiction completes the proof. □

Lemma 3.3 For a given state s, it can be decided in polynomial time whether s has 
a blocking set of machines. Consequently, problem Safe State Recognition can be 
decided in polynomial time. 

Proof: Create an auxiliary digraph that corresponds to state $s$: the vertices are the
machines $M_1, \dots, M_m$. Whenever some job $J_j$ occupies a machine $M_i$, the digraph contains
an arc from $M_i$ to every machine in $\mathcal{M}^s(J_j)$. Obviously state $s$ has a blocking set of
machines if and only if the auxiliary digraph contains a strongly connected component
with the following two properties: (i) All vertices in the component are full. (ii) There
are no arcs leaving the component. Since the strongly connected components of a digraph
can easily be determined and analyzed in linear time (see for instance [3]), the desired
statement follows. □
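As an illustration of Theorem 3.2, the following Python sketch tests safety by computing the largest blocking set directly. It deliberately swaps in a simple fixpoint iteration for the strongly-connected-component argument of the proof: start with all full machines and repeatedly discard a machine that hosts a job with no remaining work or with remaining work outside the current set. Since the union of blocking sets is again blocking, the surviving set is the unique maximal blocking set. The function names and the state encoding (a list of (current machine, remaining machines) pairs) are assumptions made for this example, and capacities are assumed to be at least 1.

```python
def maximal_blocking_set(cap, state):
    """Return the largest blocking set of `state` (empty set if none exists).

    cap:   dict mapping every real machine to its capacity (assumed >= 1).
    state: list of (current_machine, remaining_machines) pairs, one per job;
           jobs on artificial machines (any label not in `cap`) are ignored.
    """
    # Remaining-machine sets of the jobs sitting on each real machine.
    on = {m: [] for m in cap}
    for cur, rem in state:
        if cur in on:
            on[cur].append(set(rem))
    # Candidates: all machines that are full.
    B = {m for m in cap if len(on[m]) == cap[m]}
    changed = True
    while changed:
        changed = False
        for m in list(B):
            # A machine cannot be blocked if one of its jobs is finished
            # (empty remaining set) or can escape to a machine outside B.
            if any(not rem or not rem <= B for rem in on[m]):
                B.remove(m)
                changed = True
    return B


def is_safe(cap, state):
    # Theorem 3.2: a state is unsafe iff it has a blocking set.
    return not maximal_blocking_set(cap, state)
```

For the deadlock of Example 1.1, `maximal_blocking_set` returns all three machines. The iteration runs in polynomial time, although it does not achieve the linear bound of the digraph approach.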
4 Analysis of reachable states



In this section we discuss the behavior of reachable system states. We say that a state $t$ is
subset-reachable from state $s$, if every job $J_j$ satisfies one of the following three conditions:

• $M^t(J_j) = M^s(J_j)$ and $\mathcal{M}^t(J_j) = \mathcal{M}^s(J_j)$, or

• $M^t(J_j) \in \mathcal{M}^s(J_j)$ and $\mathcal{M}^t(J_j) \subseteq \mathcal{M}^s(J_j) - \{M^t(J_j)\}$, or

• $M^t(J_j) = M_{m+1}$ and $\mathcal{M}^t(J_j) = \emptyset$.

Clearly whenever a state $t$ is reachable from some state $s$, then $t$ is also subset-reachable
from s. The following example demonstrates that the reverse implication is not necessar- 
ily true. This example also indicates that the algorithmic problem Reachable State 
Recognition (as formulated in the introduction) is not completely straightforward. 

Example 4.1 Consider an open shop system with two machines $M_1, M_2$ of capacity 1
and two jobs $J_1, J_2$ with $\mathcal{M}(J_1) = \mathcal{M}(J_2) = \{M_1, M_2\}$. Consider the state $s$ where $J_1$ is
being processed on $M_1$ and $J_2$ is being processed on $M_2$, and where $\mathcal{M}^s(J_1) = \mathcal{M}^s(J_2) = \emptyset$.
It can be seen that $s$ is subset-reachable from the initial state $0$, whereas $s$ is not reachable
from $0$.

Our next goal is to derive a polynomial time algorithm for recognizing reachable
system states. Consider an open shop scheduling system and a fixed system state $s$.
Without loss of generality we assume that $s$ is subset-reachable from the initial state.
We define a new (artificial) state $t$ where $M^t(J_j) := M^s(J_j)$ and
$\mathcal{M}^t(J_j) := \mathcal{M}(J_j) - \mathcal{M}^s(J_j) - \{M^s(J_j)\}$ for all jobs $J_j$. Note that in both states $s$ and $t$ every job is sitting
on the very same machine, but the work that has already been performed in state $s$ is
exactly the work that still needs to be done in state $t$.

Lemma 4.2 State s is reachable if and only if state t is safe. 

Proof: First assume that $s$ is reachable, and let $0 = s_0 \to s_1 \to \dots \to s_k = s$ denote a
corresponding witness sequence of moves. Define a new sequence $t = t_k \to t_{k-1} \to \dots \to t_0 = f$
of moves: Whenever the move $s_\ell \to s_{\ell+1}$ ($0 \le \ell \le k-1$) results from moving job
$J_j$ from machine $M_a$ to machine $M_b$, then the move $t_{\ell+1} \to t_\ell$ results from moving job
$J_j$ from machine $M_b$ to machine $M_a$. (Note that the artificial machines $M_0$ and $M_{m+1}$
switch their roles.) Hence $t$ is safe. A symmetric argument shows that if $t$ is safe then $s$
is reachable. □

Hence deciding reachability is algorithmically equivalent to deciding safeness. Together
with Lemma 3.3 this yields the following theorem.

Theorem 4.3 Reachable State Recognition can be decided in polynomial time. □ 
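The time-reversal idea behind Lemma 4.2 can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the names `reversed_state`, `is_safe`, and `is_reachable` and the state encoding are assumptions, the safety test is a simple fixpoint search for a blocking set, and the input state is assumed to respect all capacities and to be subset-reachable from the initial state.

```python
def is_safe(cap, state):
    """Safety test via Theorem 3.2: safe iff no blocking set exists."""
    on = {m: [set(rem) for cur, rem in state if cur == m] for m in cap}
    B = {m for m in cap if len(on[m]) == cap[m]}  # all full machines
    while True:
        # Machines hosting a finished job or a job that can leave B.
        bad = {m for m in B if any(not rem or not rem <= B for rem in on[m])}
        if not bad:
            return not B
        B -= bad


def reversed_state(needs, state):
    """Build the artificial state t of Lemma 4.2 from state s.

    needs: list of machine sets M(J_j), one per job.
    state: list of (current_machine, remaining_machines) pairs for s.
    Every job stays on its machine; its remaining work in t is exactly
    the work it has already completed in s.
    """
    return [(cur, set(need) - set(rem) - {cur})
            for need, (cur, rem) in zip(needs, state)]


def is_reachable(cap, needs, state):
    # Lemma 4.2: s is reachable iff the reversed state t is safe.
    return is_safe(cap, reversed_state(needs, state))


# Example 4.1: both jobs claim to be finished with everything else,
# yet each sits on the machine the other must already have visited.
cap = {"M1": 1, "M2": 1}
needs = [{"M1", "M2"}, {"M1", "M2"}]
s = [("M1", set()), ("M2", set())]
```

With these inputs `is_reachable(cap, needs, s)` evaluates to `False`, in line with Example 4.1, whereas the deadlock in which each job still needs the other machine is reported as reachable.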
The following lemma states a simple sufficient condition that makes a state reachable. 
Lemma 4.4 Let $s$ be a state, and let $\mathcal{K}$ be a subset of machines such that every job that
still needs further processing in $s$ satisfies $M^s(J_j) \in \mathcal{K}$, and

$\mathcal{M}^s(J_j) \cup \{M^s(J_j)\} = \mathcal{K} \cap \mathcal{M}(J_j)$.

Then $s$ is a reachable system state.

Proof: By renaming the jobs we assume that the jobs $J_j$ with $1 \le j \le k$ have
$M^s(J_j) = M_{m+1}$ and the jobs $J_j$ with $k+1 \le j \le n$ have $M^s(J_j) \in \mathcal{K}$. We handle the jobs one by
one in their natural order: every job moves through all machines in $\mathcal{M}(J_j) - \mathcal{M}^s(J_j)$,
and ends up on machine $M^s(J_j)$. Then the next job is handled. □

5 Analysis of state-to-state reachability 

We establish NP-hardness of State-to-State Reachability by means of a reduction
from the following satisfiability problem; see Garey & Johnson [4].

Problem: Three-Satisfiability 

Input: A set $X = \{x_1, \dots, x_n\}$ of $n$ logical variables; a set $C = \{c_1, \dots, c_m\}$
of $m$ clauses over $X$ that each contain three literals.

Question: Is there a truth assignment for X that satisfies all clauses in C? 

We start from an instance of Three-Satisfiability, and construct a corresponding
instance of State-to-State Reachability for it. Throughout we will use $\ell_i$ to denote
the unnegated literal $x_i$ or the negated literal $\overline{x_i}$ for some fixed variable $x_i \in X$, and we
will use $\ell$ to denote a generic literal over $X$. Altogether there are $5n + m$ machines:

• For every literal $\ell_i$, there are three corresponding machines $S(\ell_i)$, $T(\ell_i)$, and $U(\ell_i)$.
Machine $U(\ell_i)$ has capacity 2, whereas machines $S(\ell_i)$ and $T(\ell_i)$ have capacity 1.
For every variable $x_i \in X$ the two machines $U(x_i)$ and $U(\overline{x_i})$ coincide, and the
corresponding machine will sometimes simply be called $U(i)$.

• For every clause $c_j \in C$, there is a corresponding machine $V(c_j)$ with capacity 3.

Furthermore the scheduling instance contains $4n$ jobs that correspond to literals and $6m$
jobs that correspond to clauses. For every literal $\ell_i$ there are two corresponding jobs:

• Job $J(\ell_i)$ is sitting on machine $S(\ell_i)$ in state $s$. In state $t$ it has moved to machine
$U(\ell_i)$ without visiting other machines in between.

• Job $J'(\ell_i)$ is still waiting outside the system in state $s$, and has already left the
system in state $t$. In between the job visits machines $S(\ell_i)$, $T(\ell_i)$, $U(\ell_i)$ in arbitrary
order.

Consider a clause $c_j$ that consists of three literals $\ell_a, \ell_b, \ell_c$. Then the following six jobs
correspond to clause $c_j$:

• For $\ell \in \{\ell_a, \ell_b, \ell_c\}$ there is a job $K(c_j, \ell)$ that in state $s$ sits on machine $V(c_j)$, then
moves through machines $S(\ell)$ and $T(\ell)$ in arbitrary order, and finally has left the
system in state $t$. Note that in state $s$ these three jobs block machine $V(c_j)$ to full
capacity.

• For $\ell \in \{\ell_a, \ell_b, \ell_c\}$ there is another job $K'(c_j, \ell)$ that waits outside the system in
state $s$, then moves through machines $U(\ell)$ and $V(c_j)$ in arbitrary order, and finally
has left the system in state $t$.
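The construction above is mechanical enough to write down as code. The following Python sketch builds the machine capacities and the two states $s$ and $t$ from a formula; it is only an illustration under assumptions made here: the function name `build_instance`, the tuple labels for machines and jobs, the labels `M0` and `Mend` for the artificial machines, and the encoding of a literal as a pair (variable index, sign) are hypothetical, and the three literals of a clause are assumed to be distinct.

```python
OUT, GONE = "M0", "Mend"  # hypothetical labels for the artificial machines


def build_instance(n, clauses):
    """Build the State-to-State Reachability instance for a 3-SAT formula.

    n:       number of variables x_1, ..., x_n.
    clauses: list of clauses, each a list of three distinct literals (i, sign),
             where sign is True for x_i and False for its negation.
    Returns (cap, s, t); s and t map each job to a pair
    (current_machine, set_of_remaining_machines).
    """
    cap, s, t = {}, {}, {}
    for i in range(1, n + 1):
        cap[("U", i)] = 2  # shared by both literals of x_i
        for sign in (True, False):
            lit = (i, sign)
            cap[("S", lit)] = 1
            cap[("T", lit)] = 1
            # J(lit): sits on S(lit) in s and on U(i) in t.
            s[("J", lit)] = (("S", lit), {("U", i)})
            t[("J", lit)] = (("U", i), set())
            # J'(lit): outside in s, must visit S, T, U, gone in t.
            s[("J'", lit)] = (OUT, {("S", lit), ("T", lit), ("U", i)})
            t[("J'", lit)] = (GONE, set())
    for j, clause in enumerate(clauses, start=1):
        cap[("V", j)] = 3
        for lit in clause:
            # K(c_j, lit): fills V(c_j) in s, must visit S(lit) and T(lit).
            s[("K", j, lit)] = (("V", j), {("S", lit), ("T", lit)})
            t[("K", j, lit)] = (GONE, set())
            # K'(c_j, lit): outside in s, must visit U(lit) and V(c_j).
            s[("K'", j, lit)] = (OUT, {("U", lit[0]), ("V", j)})
            t[("K'", j, lit)] = (GONE, set())
    return cap, s, t
```

For a formula with $n$ variables and $m$ clauses the returned dictionaries contain $5n + m$ machines and $4n + 6m$ jobs, matching the counts stated in the text.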

In Sections 5.1 and 5.2 we will show that in the constructed scheduling instance state $t$ is
reachable from state $s$ if and only if the Three-Satisfiability instance has a satisfying
truth assignment. This then implies the following theorem.

Theorem 5.1 State-to-State Reachability is NP-complete. 
5.1 Proof of the if-statement 

We assume that the Three-Satisfiability instance has a satisfying truth assignment. 
We describe a sequence of moves that brings the scheduling system from the starting 
state s into the goal state t. 

In a first phase, for every true variable $x_i$ the job $J(x_i)$ moves from machine $S(x_i)$
to machine $U(i)$. Then job $J'(x_i)$ enters the system by moving to $U(i)$, then moves to
$T(x_i)$, then to $S(x_i)$, and finally leaves the system. Next job $J'(\overline{x_i})$ enters the system,
moves to $U(i)$, and finally sits and waits on $T(\overline{x_i})$. Symmetric moves (with the roles
of $x_i$ and $\overline{x_i}$ interchanged) are performed for every false variable $x_i$. At the end of this
phase, for every true literal $\ell_i$ the two machines $S(\ell_i)$ and $T(\ell_i)$ are empty, and there is
an empty spot on machine $U(\ell_i)$.

In the second phase, we consider clauses $c_j$ that consist of three literals $\ell_a, \ell_b, \ell_c$. We
pick one true literal $\ell_i$ from $c_j$, and we let the corresponding job $K(c_j, \ell_i)$ jump away
from machine $V(c_j)$ to machine $S(\ell_i)$, then to $T(\ell_i)$, and finally make it leave the system.
This yields a free spot on machine $V(c_j)$. For every $\ell \in \{\ell_a, \ell_b, \ell_c\}$ we let job $K'(c_j, \ell)$
enter the system, move through machines $U(\ell)$ and $V(c_j)$, and then leave the system. At
the end of this phase, four out of the six jobs corresponding to every clause have reached
their final destination in state $t$.

In the third phase, for every true variable $x_i$ the job $J(\overline{x_i})$ moves from machine $S(\overline{x_i})$
to machine $U(i)$. Job $J'(\overline{x_i})$ moves from $T(\overline{x_i})$ to $S(\overline{x_i})$, and then leaves the system.
Symmetric moves (with the roles of $x_i$ and $\overline{x_i}$ interchanged) are performed for every false
variable $x_i$. At the end of this phase, all jobs $J(\ell_i)$ and $J'(\ell_i)$ have reached their final
destination in state $t$. All machines $S(\ell_i)$ and $T(\ell_i)$ are empty.

In the fourth phase, we again consider clauses $c_j$ that consist of three literals $\ell_a, \ell_b, \ell_c$.
For the two literals $\ell$ in $c_j$ that did not get picked in the second phase, we move the
corresponding job $K(c_j, \ell)$ from machine $V(c_j)$ to machine $S(\ell)$, then to machine $T(\ell)$,
and finally make it leave the system. At the end of this phase all jobs have reached their
final destination, and the system has reached the desired goal state $t$.
5.2 Proof of the only-if-statement

We assume that there is a sequence of moves that brings the scheduling system from 
state s into state t. We will deduce from this a satisfying truth assignment for the 
Three-Satisfiability instance.

We say that variable $x_i$ is activated as soon as one of the corresponding jobs $J(x_i)$
and $J(\overline{x_i})$ moves to machine $U(i)$. We say that $x_i$ is deactivated at the moment $\mu_i$ in
time where also the other job $J(x_i)$ or $J(\overline{x_i})$ moves to machine $U(i)$. If job $J(x_i)$ moves
first and activates $x_i$, we set variable $x_i$ to true; if $J(\overline{x_i})$ moves first and activates $x_i$, we
set variable $x_i$ to false. We will show that the resulting truth setting satisfies all clauses.

Lemma 5.2 If $\ell_i$ is a false literal, then job $K(c_j, \ell_i)$ can visit machine $T(\ell_i)$ only after
the deactivation time $\mu_i$ of variable $x_i$.

Proof: Till the crucial moment $\mu_i$ where variable $x_i$ is deactivated, job $J(\ell_i)$ is
permanently blocking machine $S(\ell_i)$. From time $\mu_i$ onwards, jobs $J(x_i)$ and $J(\overline{x_i})$ together are
permanently blocking the machine $U(i)$ with capacity 2.

Suppose for the sake of contradiction that some job $K(c_j, \ell_i)$ moves to machine $T(\ell_i)$
before moment $\mu_i$. Then at time $\mu_i$, it is waiting for its final processing on machine
$S(\ell_i)$ and blocking machine $T(\ell_i)$. We claim that under these circumstances job $J'(\ell_i)$ is
causing trouble: In case $J'(\ell_i)$ has not yet entered the system at time $\mu_i$, it can never be
processed on machine $U(i)$ which is permanently blocked from time $\mu_i$ onwards. In case
$J'(\ell_i)$ has already entered the system at time $\mu_i$, then at time $\mu_i$ it must be sitting on
machine $U(i)$ and thereby prevents job $J(\ell_i)$ from moving there. In either case we reach
a contradiction. □

Now let us consider some arbitrary clause $c_j$ that consists of three literals $\ell_a, \ell_b, \ell_c$, let
$x_a, x_b, x_c$ be the three underlying variables in $X$, and assume without loss of generality
that the corresponding moments of deactivation satisfy $\mu_a < \mu_b < \mu_c$.

Lemma 5.3 At time $\mu_a$ job $K'(c_j, \ell_a)$ must either be sitting on machine $V(c_j)$, or must
have left the system.

Proof: If at time $\mu_a$ job $K'(c_j, \ell_a)$ is still waiting outside the system, then it will never
be processed on machine $U(a)$, which is permanently blocked by jobs $J(x_a)$ and $J(\overline{x_a})$.
Hence there is no way of reaching state $t$, which is a contradiction. If at time $\mu_a$ job
$K'(c_j, \ell_a)$ is sitting on machine $U(a)$, it thereby prevents variable $x_a$ from being
deactivated. That is another contradiction. □

Hence at time $\mu_a$ job $K'(c_j, \ell_a)$ must already have visited machine $V(c_j)$. Since in
the starting state $s$ the three jobs $K(c_j, \ell_a)$, $K(c_j, \ell_b)$, $K(c_j, \ell_c)$ are blocking $V(c_j)$, one
of them must have made space and must have moved away before time $\mu_a$; let this job
be $K(c_j, \ell_i)$ where $i \in \{a, b, c\}$. We distinguish two cases. First, assume that $K(c_j, \ell_i)$
has moved to machine $S(\ell_i)$. Since variable $x_i$ is still active, literal $\ell_i$ must be true in
this case. Secondly, assume that $K(c_j, \ell_i)$ has moved to machine $T(\ell_i)$. Since variable
$x_i$ is still active, Lemma 5.2 yields that literal $\ell_i$ is true. In either case, clause $c_j$ contains
the true literal $\ell_i$.

We conclude that every clause contains some true literal, and that the defined truth
setting satisfies all clauses. This completes the proof of Theorem 5.1.

6 Analysis of reachable deadlocks 

In this section we show that Reachable Deadlock is an NP-hard problem. Our 
reduction is from the following variant of the Three-Dimensional Matching problem; 
see Garey & Johnson [4, p. 221].

Problem: Three-Dimensional Matching 

Instance: An integer $n$. Three pairwise disjoint sets $A = \{a_1, \dots, a_n\}$,
$B = \{b_1, \dots, b_n\}$, and $C = \{c_1, \dots, c_n\}$. A set $T \subseteq A \times B \times C$ of triples, such
that every element occurs in at most three triples in $T$.

Question: Does there exist a subset $T' \subseteq T$ of $n$ triples, such that every
element in $A \cup B \cup C$ occurs in exactly one triple in $T'$?

We start from an arbitrary instance of Three-Dimensional Matching, and construct 
the following corresponding instance of Reachable Deadlock for it. There are two 
types of machines. Note that every machine has capacity at most three. 

• There are $n + 2$ so-called structure machines $S_0, \dots, S_{n+1}$, each of capacity 1.

• For every triple $t \in T$, there is a corresponding triple machine $T_t$ with capacity 3.

Furthermore there are $4n + 2$ jobs.

• For every element $a_i \in A$ there are two corresponding A-element jobs $J^+(a_i)$ and
$J^-(a_i)$. Job $J^+(a_i)$ requires processing on structure machine $S_i$, and on every triple
machine $T_t$ with $a_i \in t$. Job $J^-(a_i)$ requires processing on structure machine $S_{i-1}$,
and on every triple machine $T_t$ with $a_i \in t$.

• For every element $b_i \in B$ there is a corresponding B-element job $J(b_i)$ that requires
processing on structure machine $S_{n+1}$, and on every triple machine $T_t$ with $b_i \in t$.

• For every element $c_i \in C$ there is a corresponding C-element job $J(c_i)$ that requires
processing on structure machine $S_{n+1}$, and on every triple machine $T_t$ with $c_i \in t$.

• Finally there is a dummy job $D_0$ that needs processing on $S_0$ and $S_{n+1}$, and another
dummy job $D_{n+1}$ that needs processing on $S_n$ and $S_{n+1}$.

Since every element of A U B U C occurs in at most three triples, we note that each job 
requires processing on at most four machines. For the ease of later reference, we also list 
for every machine the jobs that need processing on that machine. 

• A triple machine $T_t$ with $t = (a_i, b_j, c_k)$ handles the four jobs $J^+(a_i)$, $J^-(a_i)$, $J(b_j)$,
and $J(c_k)$.

• Structure machine $S_i$ with $1 \le i \le n-1$ handles the jobs $J^+(a_i)$ and $J^-(a_{i+1})$.

• Structure machine $S_0$ handles the two jobs $J^-(a_1)$ and $D_0$.

• Structure machine $S_n$ handles the two jobs $J^+(a_n)$ and $D_{n+1}$.

• Structure machine $S_{n+1}$ handles $2n + 2$ jobs: $D_0$, $D_{n+1}$, all B-element jobs, and all
C-element jobs.
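The instance described above can be generated mechanically. The Python sketch below is an illustration under assumptions made here: the function name `build_deadlock_instance`, the tuple labels for machines and jobs, and the encoding of a triple $(a_i, b_j, c_k)$ as the index triple `(i, j, k)` are hypothetical and not taken from the text.

```python
def build_deadlock_instance(n, triples):
    """Build the Reachable Deadlock instance for a 3DM instance.

    n:       size of each of the sets A, B, C.
    triples: list of index triples (i, j, k), 1 <= i, j, k <= n,
             standing for (a_i, b_j, c_k).
    Returns (cap, jobs): machine capacities and, for every job,
    the set of machines on which it needs processing.
    """
    # Structure machines S_0, ..., S_{n+1} with capacity 1.
    cap = {("S", i): 1 for i in range(n + 2)}
    # One triple machine of capacity 3 per triple.
    for t in triples:
        cap[("T", t)] = 3

    jobs = {}
    for i in range(1, n + 1):
        with_a = {("T", t) for t in triples if t[0] == i}
        with_b = {("T", t) for t in triples if t[1] == i}
        with_c = {("T", t) for t in triples if t[2] == i}
        jobs[("J+", i)] = {("S", i)} | with_a       # A-element job J^+(a_i)
        jobs[("J-", i)] = {("S", i - 1)} | with_a   # A-element job J^-(a_i)
        jobs[("Jb", i)] = {("S", n + 1)} | with_b   # B-element job J(b_i)
        jobs[("Jc", i)] = {("S", n + 1)} | with_c   # C-element job J(c_i)
    # The two dummy jobs.
    jobs["D0"] = {("S", 0), ("S", n + 1)}
    jobs["Dn+1"] = {("S", n), ("S", n + 1)}
    return cap, jobs
```

If every element occurs in at most three triples, each job in the returned dictionary needs at most four machines, and the total number of jobs is $4n + 2$.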

The following theorem contains the main result of this section. 

Theorem 6.1 Reachable Deadlock is NP-complete, even if the capacity of each
machine is at most three, and if each job requires processing on at most four machines.

Indeed, Proposition 2.1 yields an NP-certificate for problem Reachable Deadlock.
The hardness argument proves that the constructed scheduling instance has a reachable
deadlock if and only if the Three-Dimensional Matching instance has answer YES.
The only-if-statement will be proved in Section 6.1 and the if-statement will be proved
in Section 6.2.

6.1 Proof of the only-if-statement 

We assume that the scheduling instance has a reachable deadlock state s, and we will 
show that then the Three-Dimensional Matching instance has answer YES.

Let $B^*$ be a blocking set of minimum cardinality in $s$, and let $\mathcal{J}^*$ denote the jobs that
are currently being processed on machines in $B^*$. For every triple machine $T_t$ in $B^*$, the
job set $\mathcal{J}^*$ contains all four element jobs that need processing on $T_t$. (First: Machine
$T_t$ must be full and hence must process three jobs. Second: If no other job in $\mathcal{J}^*$ needs
processing on $T_t$, then $B^* - \{T_t\}$ would yield a smaller blocking set.) Similarly, for every
structure machine $S_i \in B^*$ with $0 \le i \le n$, the job set $\mathcal{J}^*$ contains both jobs that need
processing on $S_i$.

Lemma 6.2 The blocking set $B^*$ contains at least one of the structure machines $S_i$ with
$0 \le i \le n$.

Proof: Suppose otherwise. Then $B^*$ consists solely of triple machines and perhaps of
machine $S_{n+1}$.

We first claim that every triple machine in $B^*$ processes exactly one A-element job,
one B-element job, and one C-element job. Indeed, there is an A-element job $J \in \mathcal{J}^*$
that corresponds to some element $a_i \in A$ and that is processed on some triple machine
$T_t \in B^*$. The machine set $\mathcal{M}^s(J)$ of this job contains another triple machine $T_u \in B^*$.
Then $a_i \in t$ and $a_i \in u$, and both machines $T_t$ and $T_u$ must be processing one A-element
job (that corresponds to element $a_i$), one B-element job, and one C-element job. This
establishes the claim.

Next fix a B-element job $J(b_i) \in \mathcal{J}^*$ that is processed on some machine $T_t$ in $B^*$. The
machine set $\mathcal{M}^s(J(b_i))$ contains yet another machine from $B^*$. This cannot be a triple
machine $T_v \in B^*$. (Every such machine $T_v$ is processing another B-element job $J(b_j)$
with $j \neq i$, which implies $b_i \notin v$.) Hence $J(b_i)$ needs future processing on the structure
machine $S_{n+1}$, and $S_{n+1} \in B^*$. Then $S_{n+1}$ must be blocked by some job that needs future
processing on some other machine in $B^*$. But neither $D_0$, nor $D_{n+1}$, nor any B-element
or C-element job can do that. □

Lemma 6.3 (i) Let $S_i \in \mathcal{B}^*$ with $1 \le i \le n$, and let job $J^+(a_i)$ be running on $S_i$. Then 
there exists exactly one triple machine $T_t \in \mathcal{B}^*$ with $a_i \in t$, and this machine is processing 
job $J^-(a_i)$. Furthermore $S_{i-1} \in \mathcal{B}^*$. 

(ii) Let $S_{i-1} \in \mathcal{B}^*$ with $1 \le i \le n$, and let job $J^-(a_i)$ be running on $S_{i-1}$. Then there 
exists exactly one triple machine $T_t \in \mathcal{B}^*$ with $a_i \in t$, and this machine is processing job 
$J^+(a_i)$. Furthermore $S_i \in \mathcal{B}^*$. 

Proof: As the statements (i) and (ii) are symmetric, we only discuss (i). Consider job 
$J^+(a_i)$ on machine $S_i$. Since $\emptyset \ne \mathcal{M}^s(J^+(a_i)) \subseteq \mathcal{B}^*$, we conclude that $J^+(a_i)$ still needs 
to be processed on a triple machine $T_t \in \mathcal{B}^*$, say with $t = (a_i, b_j, c_k)$. Since $T_t$ is full, it 
must be processing the three jobs $J^-(a_i)$, $J(b_j)$, and $J(c_k)$. Then none of the remaining 
triple machines $T_u$ with $a_i \in u$ can be full, and hence none of them can be in $\mathcal{B}^*$. 

The job $J^-(a_i) \in \mathcal{J}^*$ is running on $T_t \in \mathcal{B}^*$ and still needs future processing on 
another machine in $\mathcal{B}^*$. The only remaining candidate for this machine is $S_{i-1}$. □ 

Lemma 6.4 The blocking set $\mathcal{B}^*$ either contains machine $S_0$ which is busy with $D_0 \in \mathcal{J}^*$, 
or machine $S_n$ which is busy with $D_{n+1} \in \mathcal{J}^*$. In either case, the blocking set $\mathcal{B}^*$ contains 
the structure machine $S_{n+1}$. 

Proof: Lemma 6.2 yields that $S_r \in \mathcal{B}^*$ for some $r$ with $0 \le r \le n$. First assume $1 \le r \le n$ 
and that $S_r$ is busy with $J^+(a_r)$. Then an inductive argument based on Lemma 6.3.(i) 
yields $S_i \in \mathcal{B}^*$ for $0 \le i \le r$. Moreover for $1 \le i \le r$ machine $S_i$ is busy with $J^+(a_i)$, 
and finally $S_0 \in \mathcal{B}^*$ must be busy with $D_0$. Next assume $0 \le r \le n-1$ and that $S_r$ is 
busy with $J^-(a_{r+1})$. Then a symmetric argument based on Lemma 6.3.(ii) yields that 
machine $S_n \in \mathcal{B}^*$ is busy with $D_{n+1}$. This establishes the first part of the lemma. 

If $S_0 \in \mathcal{B}^*$ is busy with $D_0$, then $D_0 \in \mathcal{J}^*$ requires future processing on another 
machine in $\mathcal{B}^*$, which must be $S_{n+1}$. If $S_n \in \mathcal{B}^*$ is busy with $D_{n+1}$, then $D_{n+1} \in \mathcal{J}^*$ 
requires future processing on another machine in $\mathcal{B}^*$, which must be $S_{n+1}$. In either case 
this yields the second part of the lemma. □ 

From now on we will assume that machine $S_0 \in \mathcal{B}^*$ is busy with $D_0$. (The case where 
$S_n \in \mathcal{B}^*$ is busy with $D_{n+1}$ can be settled in a symmetric way.) We distinguish two cases 
on the job running on $S_{n+1}$. 

(Case 1) Assume that $S_{n+1}$ is busy with $D_{n+1} \in \mathcal{J}^*$. Then $D_{n+1}$ is waiting for 
another machine in $\mathcal{B}^*$, which must be machine $S_n$ that is busy with $J^+(a_n)$. We claim 
for $1 \le i \le n$ that $J^+(a_i)$ is processed on machine $S_i \in \mathcal{B}^*$, and that job $J^-(a_i)$ is 
processed on a triple machine $T_t \in \mathcal{B}^*$ with $a_i \in t$. The claim is proved by a simple 
inductive argument based on Lemma 6.3.(i) starting with $i = n$ and going down to $i = 1$. 
Then every triple machine in $\mathcal{B}^*$ processes one of the jobs $J^-(a_1), \ldots, J^-(a_n)$, one 
B-element job, and one C-element job. These $n$ triple machines induce a solution for the 
Three-Dimensional Matching instance. 

(Case 2) Assume that $S_{n+1}$ is busy with a B-element job or a C-element job; without 
loss of generality we assume that it is busy with a B-element job $J(b_r)$. Then $J(b_r)$ 
is waiting for another machine in $\mathcal{B}^*$, which must be a full triple machine $T_t$ with 
$t = (a_j, b_r, c_k)$. Machine $T_t$ is busy with the three jobs $J^-(a_j)$, $J^+(a_j)$, and $J(c_k)$. 

(i) Job $J^-(a_j)$ is waiting for a full machine in $\mathcal{B}^*$. If $j \ge 2$, then this must be the 
structure machine $S_{j-1}$ which is processing job $J^+(a_{j-1})$ (and if $j = 1$, then it 
is the structure machine $S_0$ which is processing job $D_0$). An inductive argument 
based on Lemma 6.3.(i) yields that for $1 \le i \le j-1$ job $J^+(a_i)$ is processed on 
machine $S_i \in \mathcal{B}^*$, and job $J^-(a_i)$ is processed on a triple machine $T_t \in \mathcal{B}^*$ with 
$a_i \in t$. 

(ii) Also job $J^+(a_j)$ is waiting for a full machine in $\mathcal{B}^*$. If $j \le n-1$, then this must be 
the structure machine $S_j$ which is processing job $J^-(a_{j+1})$ (and if $j = n$, then it is 
the structure machine $S_n$ which is processing job $D_{n+1}$). An inductive argument 
based on Lemma 6.3.(ii) yields that for $j+1 \le i \le n$ job $J^-(a_i)$ is processed on 
machine $S_{i-1} \in \mathcal{B}^*$ and job $J^+(a_i)$ is processed on a triple machine $T_t \in \mathcal{B}^*$ with 
$a_i \in t$. 

Now the $j-1$ triple machines in (i), the $n-j$ triple machines in (ii), and the triple 
machine $T_t$ with $t = (a_j, b_r, c_k)$ together induce a solution for the Three-Dimensional 
Matching instance. This completes the analysis of Case 2, and it also completes the 
proof of the only-if-statement. 

6.2 Proof of the if-statement 

We assume that the Three-Dimensional Matching instance has a solution $T' \subseteq T$, 
and from this we will derive a reachable deadlock state for the scheduling instance. 

Consider the subset $\mathcal{K} = \{T_t : t \in T'\} \cup \{S_i : 0 \le i \le n+1\}$ of machines. We construct 
a state $t$ where every job $J$ has already entered the system, has already been processed 
on all machines in $\mathcal{M}(J) - \mathcal{K}$, and is currently being processed on its first machine from 
$\mathcal{M}(J) \cap \mathcal{K}$. Hence the assignment of jobs to machines determines the entire state $t$. We 
assign job $D_0$ to machine $S_0$, and job $D_{n+1}$ to structure machine $S_{n+1}$. For every triple 
$(a_i, b_j, c_k) \in T'$, we assign the three jobs $J^-(a_i)$, $J(b_j)$, $J(c_k)$ to the corresponding 
triple machine, and we assign job $J^+(a_i)$ to structure machine $S_i$. 

The resulting state $t$ has $\mathcal{K}$ as blocking set and is in deadlock. Furthermore Lemma 4.4 
shows that $t$ is a reachable state. All in all, this yields a reachable deadlock state $t$. 
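
The job-to-machine assignment described above can be written down mechanically. The following is only an illustrative sketch: the function name `deadlock_assignment`, the string labels for jobs and machines, and the encoding of a triple $(a_i, b_j, c_k)$ as the index tuple `(i, j, k)` are assumptions made for this example and do not appear in the construction itself.

```python
def deadlock_assignment(n, matching):
    """Map every job to the machine it occupies in the deadlock state.

    `matching` is assumed to be a list of index triples (i, j, k), each
    standing for a triple (a_i, b_j, c_k) of the solution T'.
    """
    # The two dummy jobs sit on the structure machines S_0 and S_{n+1}.
    assign = {"D_0": "S_0", f"D_{n + 1}": f"S_{n + 1}"}
    for i, j, k in matching:
        triple_machine = f"T_({i},{j},{k})"
        # Three element jobs fill the triple machine ...
        assign[f"J-(a_{i})"] = triple_machine
        assign[f"J(b_{j})"] = triple_machine
        assign[f"J(c_{k})"] = triple_machine
        # ... and the fourth one waits on structure machine S_i.
        assign[f"J+(a_{i})"] = f"S_{i}"
    return assign


example = deadlock_assignment(2, [(1, 2, 1), (2, 1, 2)])
```

The sketch only records where each job sits; checking that the result is a deadlock still relies on the machine sets defined in the reduction.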

7 Reachable deadlocks if jobs require two machines 

Throughout this section we only consider open shop systems where $|\mathcal{M}(J)| = 2$ holds 
for all jobs $J$. We introduce for every job $J$ and for every machine $M \in \mathcal{M}(J)$ a 
corresponding real variable $x(J,M)$, and for every machine $M$ a corresponding real variable 
$y(M)$. Our analysis is centered around the following linear program (LP): 

\[
\begin{array}{lll}
\min & \sum_{M} \max\{y(M), \mathrm{cap}(M)\} & \\[1ex]
\text{s.t.} & \sum_{J:\, M \in \mathcal{M}(J)} x(J,M) = y(M) & \text{for all machines } M \\[1ex]
 & \sum_{M \in \mathcal{M}(J)} x(J,M) = 1 & \text{for all jobs } J \\[1ex]
 & x(J,M) \ge 0 & \text{for all } J \text{ and } M \in \mathcal{M}(J)
\end{array}
\]

Although this linear program is totally unimodular, we will mainly deal with its fractional 
solutions. 
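
The objective of (LP) contains a maximum, so strictly speaking it is piecewise linear. One standard way to obtain a genuinely linear formulation is sketched below; the auxiliary variables $z(M)$ are an assumption introduced only for this illustration and are not part of the formulation above.

```latex
\[
\begin{array}{lll}
\min & \sum_{M} z(M) & \\[1ex]
\text{s.t.} & z(M) \ge y(M), \quad z(M) \ge \mathrm{cap}(M) & \text{for all machines } M \\[1ex]
 & \sum_{J:\, M \in \mathcal{M}(J)} x(J,M) = y(M) & \text{for all machines } M \\[1ex]
 & \sum_{M \in \mathcal{M}(J)} x(J,M) = 1 & \text{for all jobs } J \\[1ex]
 & x(J,M) \ge 0 & \text{for all } J \text{ and } M \in \mathcal{M}(J)
\end{array}
\]
```

Since every $z(M)$ is minimized and bounded from below only by $y(M)$ and $\mathrm{cap}(M)$, an optimal solution has $z(M) = \max\{y(M), \mathrm{cap}(M)\}$, so both formulations have the same optimal value.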

Lemma 7.1 One can compute in polynomial time an optimal solution for the linear 
program (LP) that additionally satisfies the following property (*) for every job $J$ with 
$\mathcal{M}(J) = \{M_a, M_b\}$: If $y(M_a) \ge \mathrm{cap}(M_a)$ and $x(J,M_a) > 0$, then $y(M_b) \ge \mathrm{cap}(M_b)$. 

Proof: We determine in polynomial time an optimal solution of (LP). Then we perform 
a polynomial number of post-processing steps on this optimal solution, as long as there 
exists a job violating property (*). In this case $y(M_a) \ge \mathrm{cap}(M_a)$, $x(J,M_a) > 0$, and 
$y(M_b) < \mathrm{cap}(M_b)$. 

The post-processing step decreases the values $x(J,M_a)$ and $y(M_a)$ by some $\varepsilon > 0$, 
and simultaneously increases $x(J,M_b)$ and $y(M_b)$ by the same $\varepsilon$. By picking $\varepsilon$ smaller 
than the minimum of $\mathrm{cap}(M_b) - y(M_b)$ and $x(J,M_a)$ this will yield another feasible 
solution for (LP). What happens to the objective value? If $y(M_a) > \mathrm{cap}(M_a)$ at the 
beginning of the step, then the step would decrease the objective value, which contradicts 
optimality. If $y(M_a) = \mathrm{cap}(M_a)$ at the beginning of the step, then the step leaves the 
objective value unchanged, and yields another optimal solution with $y(M_a) < \mathrm{cap}(M_a)$ 
and $y(M_b) < \mathrm{cap}(M_b)$. 

To summarize, every post-processing step decreases the number of machines M with 
y(M) = cap(M). Hence the entire procedure terminates after at most m steps. □ 
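
The post-processing loop from this proof can be sketched as follows. This is a minimal illustration, not a full implementation: it assumes that an optimal solution of (LP) is already given as a dictionary of exact `Fraction` values, and the names `machine_loads`, `find_violation` and `postprocess` as well as the choice of halving the admissible step size are assumptions of this example.

```python
from fractions import Fraction


def machine_loads(x, cap):
    """Compute y(M) as the sum of x(J, M) over all jobs J."""
    y = {M: Fraction(0) for M in cap}
    for (_, M), value in x.items():
        y[M] += value
    return y


def find_violation(x, y, cap, machines_of):
    """Return some (J, Ma, Mb) violating property (*), or None."""
    for J, (M1, M2) in machines_of.items():
        for Ma, Mb in ((M1, M2), (M2, M1)):
            if y[Ma] >= cap[Ma] and x[J, Ma] > 0 and y[Mb] < cap[Mb]:
                return J, Ma, Mb
    return None


def postprocess(x, cap, machines_of):
    """Shift weight until property (*) holds; x is assumed to be optimal."""
    x = dict(x)
    # Lemma 7.1 bounds the number of steps by the number m of machines.
    for _ in range(len(cap) + 1):
        y = machine_loads(x, cap)
        violation = find_violation(x, y, cap, machines_of)
        if violation is None:
            return x
        J, Ma, Mb = violation
        # Any step strictly below both bounds is allowed; halving is arbitrary.
        eps = min(cap[Mb] - y[Mb], x[J, Ma]) / 2
        x[J, Ma] -= eps
        x[J, Mb] += eps
    raise ValueError("too many steps; the input was probably not optimal")


# One job on machines A (capacity 1) and B (capacity 2), all weight on A.
start = {("J", "A"): Fraction(1), ("J", "B"): Fraction(0)}
result = postprocess(start, {"A": 1, "B": 2}, {"J": ("A", "B")})
```

In the small usage example the starting solution is optimal (its objective value equals the total capacity 3) but violates (*), and a single step moves half of the job's weight from A to B.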

Let $x^*(J,M)$ and $y^*(M)$ denote an optimal solution of (LP) that satisfies the property 
(*) in Lemma 7.1. Let $\mathcal{M}^*$ be the set of machines $M$ with $y^*(M) \ge \mathrm{cap}(M)$. 

Lemma 7.2 The open shop system has a reachable deadlock, if and only if $\mathcal{M}^* \ne \emptyset$. 

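Proof: (Only if). Consider a reachable deadlock state, let $\mathcal{B}'$ be the corresponding 
blocking set of machines, and let $\mathcal{J}'$ be the set of jobs waiting on these machines. Every job 
$J \in \mathcal{J}'$ is sitting on some machine in $\mathcal{B}'$, and is waiting for some other machine in $\mathcal{B}'$. 
Since $|\mathcal{M}(J)| = 2$, this implies $\mathcal{M}(J) \subseteq \mathcal{B}'$ for every job $J \in \mathcal{J}'$. Then 

\[
\sum_{M \in \mathcal{B}'} y^*(M) \;\ge\; \sum_{J \in \mathcal{J}'} \sum_{M \in \mathcal{M}(J)} x^*(J,M) \;=\; |\mathcal{J}'|.
\]

Since furthermore $|\mathcal{J}'| = \sum_{M \in \mathcal{B}'} \mathrm{cap}(M)$, we conclude $y^*(M) \ge \mathrm{cap}(M)$ for at least one 
machine $M \in \mathcal{B}'$. 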

(If). Let $\mathcal{J}^*$ be the set of jobs with $x^*(J,M) > 0$ for some $M \in \mathcal{M}^*$. Property (*) 
in Lemma 7.1 now yields the following for every job $J$: If $J \in \mathcal{J}^*$, then $\mathcal{M}(J) \subseteq \mathcal{M}^*$. 
Construct a bipartite graph $G$ between the jobs in $\mathcal{J}^*$ and the machines in $\mathcal{M}^*$, with an 
edge between $J$ and $M$ if and only if $M \in \mathcal{M}(J)$. For any subset $\mathcal{M}' \subseteq \mathcal{M}^*$, the number 
of job neighbors in this bipartite graph is at least $\sum_{M \in \mathcal{M}'} y^*(M) \ge \sum_{M \in \mathcal{M}'} \mathrm{cap}(M)$. 
A variant of Hall's theorem from matching theory [8] now yields that there exists an 
assignment of some jobs from $\mathcal{J}^*$ to machines in $\mathcal{M}^*$ such that every $M \in \mathcal{M}^*$ receives 
$\mathrm{cap}(M)$ pairwise distinct jobs. 

To reach a deadlock, we first send all non-assigned jobs one by one through the 
system. They are completed and disappear. Then the assigned jobs enter the system, 
each moving directly to the machine to which it has been assigned. Then the system 
falls into a deadlock with blocking set $\mathcal{M}^*$: All machines in $\mathcal{M}^*$ are full, and all jobs are 
only waiting for machines in $\mathcal{M}^*$. □ 
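
Hall's theorem only guarantees that the assignment in the (If) part exists. One possible way to compute it is sketched below, using a standard augmenting-path bipartite matching in which every machine $M$ is split into $\mathrm{cap}(M)$ slots. This is a swapped-in textbook technique, not a procedure taken from the proof; the function name `fill_machines` and the dictionary-based input format are assumptions of this example.

```python
def fill_machines(job_machines, cap):
    """Assign distinct jobs to machines so that every machine is full.

    job_machines: job -> set of machines it may be assigned to
    cap:          machine -> capacity
    Returns a dict job -> machine, or None if some machine cannot be filled.
    """
    # Split every machine into cap(M) identical slots.
    slots = [(M, i) for M in cap for i in range(cap[M])]
    owner = {}  # job -> slot currently matched to it

    def try_slot(slot, seen):
        """Search for an augmenting path that starts at `slot`."""
        machine = slot[0]
        for job, machines in job_machines.items():
            if machine in machines and job not in seen:
                seen.add(job)
                # Take a free job, or re-route the slot that holds it.
                if job not in owner or try_slot(owner[job], seen):
                    owner[job] = slot
                    return True
        return False

    for slot in slots:
        if not try_slot(slot, set()):
            return None
    return {job: slot[0] for job, slot in owner.items()}


assignment = fill_machines(
    {"J1": {"A", "B"}, "J2": {"A", "B"}, "J3": {"A"}},
    {"A": 2, "B": 1},
)
```

Jobs that do not appear in the returned dictionary are the non-assigned jobs that are sent through the system first.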

Since jobs $J$ with $|\mathcal{M}(J)| = 1$ are harmless and may be disregarded with respect to 
deadlocks, we arrive at the following theorem. 

Theorem 7.3 For open shop systems where each job requires processing on at most two 
machines, Reachable Deadlock can be solved in polynomial time. □ 

The following example illustrates that the above LP-based approach cannot be carried 
over to the case where every job requires processing on three machines (since the only-if 
part of Lemma 7.2 breaks down). 

Example 7.4 Consider a system with two jobs and four machines of unit capacity. Job 
$J_1$ needs processing on $M_1, M_2, M_3$, and job $J_2$ needs processing on $M_1, M_2, M_4$. A 
(reachable) deadlock results if $J_1$ enters the system on $M_3$ and then moves to $M_1$, whereas $J_2$ 
simultaneously enters the system on $M_4$ and then moves to $M_2$. 

We consider a feasible solution with $x(J,M) = 1/3$ for every $J$ and every $M \in \mathcal{M}(J)$, 
and $y(M_1) = y(M_2) = 2/3$ and $y(M_3) = y(M_4) = 1/3$. The objective value is 4, which equals 
the total machine capacity and hence is a lower bound on the objective value of every feasible 
solution; therefore this is an optimal solution. The post-processing leaves the solution 
untouched, and the resulting set $\mathcal{M}^*$ is empty. 
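
The numbers in Example 7.4 can be re-checked with exact arithmetic. The snippet below is only a sanity check of the stated fractional solution; the variable names are chosen for this illustration.

```python
from fractions import Fraction

# Instance of Example 7.4: machine sets of the two jobs, unit capacities.
machines_of = {"J1": ["M1", "M2", "M3"], "J2": ["M1", "M2", "M4"]}
cap = {"M1": 1, "M2": 1, "M3": 1, "M4": 1}

# The fractional solution from the example: x(J, M) = 1/3 on every pair.
x = {(J, M): Fraction(1, 3) for J, Ms in machines_of.items() for M in Ms}

# y(M) is the total fraction of jobs sent to machine M.
y = {M: sum(v for (_, M2), v in x.items() if M2 == M) for M in cap}

# Objective of (LP) and the set M* of machines with y(M) >= cap(M).
objective = sum(max(y[M], cap[M]) for M in cap)
m_star = {M for M in cap if y[M] >= cap[M]}
```

Here `objective` evaluates to 4 and `m_star` is empty, in line with the example, although the system can reach a deadlock.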

8 Reachable deadlocks if machines have unit capacity 

Throughout this section we only consider open shop systems with $\mathrm{cap}(M_i) = 1$ for every 
machine $M_i$. For each such system we define a corresponding undirected edge-colored 
multi-graph $G = (V,E)$: The vertices are the machines $M_1, \ldots, M_m$. Every job $J_j$ induces a 
clique of edges on the vertex set $\mathcal{M}(J_j)$, and all these edges receive color $c_j$. Intuitively, 
if two machines are connected by an edge $e$ of color $c_j$, then job $J_j$ may move between these 
machines along edge $e$. 
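
The multi-graph can be built directly from the machine sets of the jobs. The following sketch represents every edge as a tuple `(u, v, color)` and uses the job name as color; this representation and the function name `build_multigraph` are assumptions made for illustration.

```python
from itertools import combinations


def build_multigraph(machines_of):
    """Return the colored edges (u, v, color): one clique per job."""
    edges = []
    for job, machines in machines_of.items():
        # Sorting only makes the output deterministic.
        for u, v in combinations(sorted(set(machines)), 2):
            edges.append((u, v, job))
    return edges


# The instance of Example 7.4 yields two triangles sharing the pair M1, M2.
edges = build_multigraph({"J1": ["M1", "M2", "M3"], "J2": ["M1", "M2", "M4"]})
```

In this instance the machines M1 and M2 are joined by two parallel edges of different colors, which is why a multi-graph rather than a simple graph is needed.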

Lemma 8.1 For an open shop system with unit machine capacities and its corresponding 
edge-colored multi-graph the following two statements are equivalent. 

(i) The multi-graph contains a simple cycle whose edges have pairwise distinct colors. 

(ii) The system can reach a deadlock. 

Proof: Assume that (i) holds, and consider a simple cycle $C$ whose edges have pairwise 
distinct colors. By renaming jobs and machines we may assume that the vertices in $C$ 
are the machines $M_1, \ldots, M_k$, and that the edges in $C$ are $[M_j, M_{j+1}]$ with color $c_j$ for 
$1 \le j \le k-1$, and $[M_k, M_1]$ with color $c_k$. Consider the following processing order of 
the jobs: 

• In the first phase, the jobs $J_j$ with $k+1 \le j \le n$ are processed one by one: Job $J_{j+1}$ 
only enters the system after job $J_j$ has completed all its processing and has already 
left the system. At the end of this phase we are left with the jobs $J_1, \ldots, J_k$. 

• In the second phase, the jobs $J_1, \ldots, J_k$ are handled one by one. When job $J_j$ is 
handled, first all operations of $J_j$ on machines $M_i$ with $i \ge k+1$ are processed. 
Then job $J_j$ moves to machine $M_j$, and stays there till the end of the second phase. 
Then the next job is handled. 

At the end of the second phase, for $1 \le i \le k$ job $J_i$ is blocking machine $M_i$, and waiting 
for future processing on some other machine in cycle $C$. The system has fallen into a 
deadlock, and hence (i) implies (ii). 

Next assume that (ii) holds, and consider a deadlock state. For every waiting job 
$J_j$ in the deadlock, let $M_j$ be the machine on which $J_j$ is currently waiting and let $M'_j$ 
denote one of the machines for which the job is waiting. Consider the sub-graph of $G$ 
that for every waiting job $J_j$ contains the vertex $M_j$ together with an edge $[M_j, M'_j]$ of 
color $c_j$. This sub-graph has as many vertices as edges, and hence must contain a simple 
cycle; hence (ii) implies (i). □ 
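
The counting argument in the second half of the proof can be made constructive: since every blocked machine points to one machine its job is waiting for, following these pointers must eventually revisit a machine. The sketch below assumes that the deadlock is given as a dictionary `waits_for` that maps each blocked machine to another blocked machine; the dictionary format and the function name `find_cycle` are assumptions of this example.

```python
def find_cycle(waits_for):
    """Follow the waiting pointers until a machine repeats; return the cycle."""
    position = {}  # machine -> index at which it was first visited
    path = []
    machine = next(iter(waits_for))
    while machine not in position:
        position[machine] = len(path)
        path.append(machine)
        machine = waits_for[machine]
    # Everything from the first visit of the repeated machine is the cycle.
    return path[position[machine]:]


# M4 waits for M1, while M1, M2, M3 wait for each other in a circle.
cycle = find_cycle({"M4": "M1", "M1": "M2", "M2": "M3", "M3": "M1"})
```

Each edge of the returned cycle belongs to a different waiting job, so its colors are pairwise distinct.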

Lemma 8.2 For the edge-colored multi-graph G = (V, E) corresponding to some open 
shop system with unit machine capacities, the following three statements are equivalent. 

(i) The multi-graph contains a simple cycle whose edges have pairwise distinct colors. 

(ii) The multi-graph contains a 2-vertex-connected component that spans edges of at 
least two different colors. 

(iii) The multi-graph contains a simple cycle whose edges have at least two different 
colors. 

Proof: We show that (i) implies (ii) implies (iii) implies (i). The implication from (i) to 
(ii) is straightforward. 

Assume that (ii) holds, and consider a vertex v in such a 2-vertex-connected com- 
ponent that is incident to two edges with two distinct colors. These two edges can be 
connected to a simple cycle, and we get (iii). 

Assume (iii), and consider the shortest cycle $C$ whose edges have at least two different 
colors. If two edges $[u,u']$ and $[v,v']$ on $C$ have the same color $c_j$, then the vertices 
$u, u', v, v'$ are all in the machine set $\mathcal{M}(J_j)$ of job $J_j$. Hence they span a clique in color 
$c_j$, and some edges in this clique can be used to construct a shorter cycle with edges of 
at least two different colors. This contradiction shows that (iii) implies (i). □ 

Lemmas 8.1 and 8.2 together yield that an open shop system can fall into a deadlock 
state if and only if the corresponding multi-graph contains a 2-vertex-connected 
component that spans edges of at least two different colors. Since the 2-vertex-connected 
components of a graph can easily be determined and analyzed in linear time (see for 
instance [13]), we arrive at the following theorem. 

Theorem 8.3 For open shop systems with unit machine capacities, problem Reachable 
Deadlock can be solved in polynomial time. □ 
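
The resulting test can be sketched with a depth-first search that collects the edges of each 2-vertex-connected component on a stack, in the style of the classical Hopcroft-Tarjan procedure. Whether this matches the procedure in [13] is an assumption; the function name `can_deadlock` and the input format are hypothetical, and the recursive formulation is only meant for small instances.

```python
from itertools import combinations


def can_deadlock(machines_of):
    """Unit capacities assumed: is there a block with edges of two colors?"""
    edges = []  # (u, v, color), color = job name
    adj = {}    # machine -> list of (neighbor, edge index)
    for job, machines in machines_of.items():
        for u, v in combinations(sorted(set(machines)), 2):
            adj.setdefault(u, []).append((v, len(edges)))
            adj.setdefault(v, []).append((u, len(edges)))
            edges.append((u, v, job))

    disc, low = {}, {}
    stack = []  # edge indices of the components still being explored
    found = False

    def dfs(u, parent_edge):
        nonlocal found
        disc[u] = low[u] = len(disc)
        for v, idx in adj[u]:
            if idx == parent_edge:
                continue  # skip only the tree edge itself, not parallel edges
            if v not in disc:
                stack.append(idx)
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:
                    # A component is complete: pop its edges, collect colors.
                    colors = set()
                    while True:
                        e = stack.pop()
                        colors.add(edges[e][2])
                        if e == idx:
                            break
                    found = found or len(colors) >= 2
            elif disc[v] < disc[u]:
                stack.append(idx)  # back edge to an ancestor
                low[u] = min(low[u], disc[v])

    for root in adj:
        if root not in disc:
            dfs(root, None)
    return found


example_7_4 = can_deadlock({"J1": ["M1", "M2", "M3"], "J2": ["M1", "M2", "M4"]})
```

A component that consists of a single bridge edge carries only one color, so every component reported by the sketch contains a cycle. Parallel edges are distinguished by their edge index, which matters for instances like Example 7.4.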

References 

[1] Z.A. Banaszak and B.H. Krogh (1990). Deadlock avoidance in flexible manu- 
facturing systems with concurrently competing process flows. IEEE Transactions on 
Robotics and Automation 6, 724-734. 

[2] E.G. Coffman, M.J. Elphick, and A. Shoshani (1971). System Deadlocks. ACM 
Computing Surveys 3, 67-78. 

[3] T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein (2001). Introduction 
to Algorithms. MIT Press. 

[4] M.R. Garey and D.S. Johnson (1979). Computers and Intractability: A Guide to 
the Theory of NP-Completeness. Freeman, San Francisco. 

[5] M. Gold (1978). Deadlock prediction: Easy and difficult cases. SIAM Journal on 
Computing 7, 320-336. 

[6] E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys (1993). 
Sequencing and scheduling: Algorithms and complexity. In: Handbooks in Operations 
Research and Management Science, Vol. 4, North Holland, 445-522. 

[7] M. Lawley (1999). Deadlock avoidance for production systems with flexible routing. 
IEEE Transactions on Robotics and Automation 15, 497-510. 

[8] L. Lovász and M.D. Plummer (1986). Matching Theory. Annals of Discrete 
Mathematics 29, North-Holland. 

[9] W. Sulistyono and M. Lawley (2001). Deadlock avoidance for manufacturing 
systems with partially ordered process plans. IEEE Transactions on Robotics and 
Automation 17, 819-832. 



[10] K. Xing, F. Lin, and B. Hu (2001). An optimal deadlock avoidance policy for man- 
ufacturing system with flexible operation sequence and flexible routing. Proceedings of 
the 2001 IEEE International Conference on Robotics and Automation (ICRA'2001), 
3565-3570. 






